Optimizing large collections of continuous content-based RSS aggregation queries

Authors

  • Jordi Creus
  • Bernd Amann
  • Vassilis Christophides
  • Nicolas Travers
  • Dan Vodislav
Abstract

In this article we present RoSeS (Really Open Simple and Efficient Syndication), a generic framework for content-based RSS feed querying and aggregation. RoSeS takes a data-centric approach that combines standard database concepts such as declarative query languages, views and multi-query optimization. Users create personalized feeds by defining and composing content-based filtering and aggregation queries on collections of RSS feeds. Publishing these queries corresponds to defining views, which can then be used to build new queries and feeds. This naturally reflects the publish-subscribe nature of RSS applications. The contributions presented in this article are a declarative RSS feed aggregation language, an extensible stream algebra for building efficient continuous multi-query execution plans for RSS aggregation views, a multi-query optimization strategy for these plans and a running prototype based on a multi-threaded asynchronous execution
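The idea of composing filtering and aggregation queries, and of publishing them as views that later queries can reuse, can be illustrated with a minimal Python sketch. This is not the RoSeS language or its stream algebra: the names `Item`, `source`, `union`, `select`, `publish` and `views` are hypothetical, the feeds are in-memory lists rather than live RSS sources, and evaluation is a one-shot pass rather than a continuous, optimized execution plan.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator, List


@dataclass(frozen=True)
class Item:
    """A simplified RSS item (hypothetical structure, not the RoSeS data model)."""
    feed: str
    title: str
    description: str


# A query is modeled as a zero-argument callable that yields items when run.
Query = Callable[[], Iterator[Item]]

# Registry of published queries; each entry plays the role of a named view.
views: Dict[str, Query] = {}


def source(items: List[Item]) -> Query:
    """Wrap a fixed list of items as a feed (stand-in for a real RSS source)."""
    return lambda: iter(items)


def union(*queries: Query) -> Query:
    """Aggregate several feeds or views into a single feed."""
    def run() -> Iterator[Item]:
        for query in queries:
            yield from query()
    return run


def select(query: Query, keyword: str) -> Query:
    """Content-based filter: keep items whose title or description contains keyword."""
    needle = keyword.lower()

    def run() -> Iterator[Item]:
        for item in query():
            if needle in item.title.lower() or needle in item.description.lower():
                yield item
    return run


def publish(name: str, query: Query) -> Query:
    """Publish a query under a name so that other queries can build on it."""
    views[name] = query
    return query


# Usage: two source feeds, one published view, and a second view built on the first.
feed_a = source([
    Item("a", "Database conference opens", "VLDB news"),
    Item("a", "Football results", "Sports"),
])
feed_b = source([Item("b", "New stream database released", "Release notes")])

publish("db_news", select(union(feed_a, feed_b), "database"))
publish("db_releases", select(views["db_news"], "release"))

db_news_titles = [item.title for item in views["db_news"]()]
db_release_titles = [item.title for item in views["db_releases"]()]
```

Because `db_releases` is defined on top of `db_news`, both views share the same filtered union. A multi-query optimizer could exploit this kind of overlap between views, although the sketch itself simply re-evaluates the shared part each time a view is run.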


Similar resources

RoSeS: A Continuous Content-Based Query Engine for RSS Feeds

In this article we present RoSeS (Really Open Simple and Efficient Syndication), a generic framework for content-based RSS feed querying and aggregation. RoSeS is based on a data-centric approach, using a combination of standard database concepts like declarative query languages, views and multiquery optimization. Users create personalized feeds by defining and composing content-based filtering...


Best-Effort Refresh Strategies for Content-Based RSS Feed Aggregation

During the past several years RSS-based content syndication has become a standard technique for the efficient and timely dissemination of information on the web. From a data processing perspective, RSS feeds are standard XML resources which are periodically refreshed by feed aggregators to generate continuous streams of items. In this article, we study the problem of information loss in the contex...


DescribeX: A Framework for Exploring and Querying XML Web Collections

Flavio Rizzolo. Doctor of Philosophy thesis, Graduate Department of Computer Science, University of Toronto, 2008. The nature of semistructured data in web collections is evolving. Even when XML web documents are valid with regard to a schema, the actual structure of such documents exhibits significant variations across collections for s...


OptimAX: optimizing distributed continuous queries

Setting. Fulfilling the vision of a decentralized Web of peers requires efficient mechanisms for decentralized dissemination of information. RSS feeds are part of this vision: incremental updates to XML documents are pushed from a given producer to a set of subscribers along known paths. In this work, we envision processing continuous XML queries. Such queries are expressed in some XML query l...


Effective Browsing of Image Search Results with Visual and Diverse Summarization through Clustering

With unprecedented growth in the production of digital images and the use of multimedia references, the demand for image and subject search has increased. Systematic processing of this information is a basic prerequisite for its effective analysis, organization and management. Likewise, large collections of images have been made available on the Web and many search engines have provided the poss...




Publication date: 2011